Rank | Count | Beginning |
---|---|---|
323 | 1674 | 2. |
7065 | 1537 | Bu |
2082 | 1019 | 3. |
26734 | 1002 | Türkmenistanyň |
24043 | 676 | Şu |
3142 | 616 | 4. |
22036 | 550 | Şeýle |
13959 | 536 | Hormatly |
9827 | 500 | Döwlet |
17370 | 434 | Milli |
3778 | 366 | 5. |
24925 | 361 | Şunuň |
23433 | 337 | Soňra |
10725 | 326 | Eger |
22899 | 315 | Şol |
18891 | 297 | Ol |
13453 | 266 | Häzirki |
26248 | 262 | Türkmen |
18908 | 261 | Olar |
18190 | 253 | Munuň |
4164 | 228 | 6. |
20045 | 225 | Onuň |
19240 | 214 | Olaryň |
24649 | 204 | Şunda |
29450 | 163 | Ýurdumyzyň |
26443 | 157 | Türkmenistan |
18051 | 136 | Mundan |
4397 | 132 | 7. |
9703 | 122 | Dowamyny |
6824 | 119 | Biziň |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
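The same count can be sketched outside the database, which may help when checking how the query generalizes to N = 2, 3, 4. The following is a minimal Python sketch, not the tooling used to produce the table. The function name `sentence_beginnings`, its parameters, and the sample sentences (placeholder tokens, not real corpus data) are assumptions made for illustration. It splits on single spaces so that it behaves like `substring_index(sentence, ' ', N)`.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: mirrors substring_index(sentence, ' ', n) by
    keeping everything before the n-th single space of each sentence.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on a literal space (not arbitrary whitespace) to match
        # the delimiter used in the SQL query.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by descending count, like "order by cnt desc limit 50".
    return counts.most_common(limit)


# Placeholder sentences for illustration only (w1, w2, ... are dummy tokens).
sample = ["Bu w1 w2.", "Bu w3 w4.", "Ol w5 w6."]
for beginning, count in sentence_beginnings(sample, n=1):
    print(beginning, count)
```

Calling the function with `n=2`, `n=3`, or `n=4` would give the counts for the longer beginnings covered in the following subsections, under the same assumption that words are separated by single spaces.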